Text-constrained speaker recognition on a text-independent task
نویسندگان
چکیده
We present an approach to speaker recognition in the textindependent domain of conversational telephone speech using a text-constrained system designed to employ select highfrequency keywords in the speech stream. The system uses speaker word models generated via Hidden Markov Models (HMMs) — a departure from the traditional Gaussian Mixture Model (GMM) approach dominant in text-independent work, but commonly employed in text-dependent systems — with the expectation that HMMs take greater advantage of sequential information and support more detailed modeling which could be used to aid recognition. Even with a keyword inventory that covers a mere 10% of the word tokens and a system that does not yet incorporate many standard speaker recognition normalization schemes, this approach is already achieving equal error rates of 1% on NIST’s 2001 Extended Data task.
منابع مشابه
Closed-Set Speaker Identification Based on a Single Word Utterance: An Evaluation of Alternative Approaches
The problem of closed-set speaker identification based on a single spoken word from a limited vocabulary is relevant to several current and futuristic interactive multimedia applications. In this paper, we evaluate the effectiveness of several potential solutions using an isolated word speech corpus. In addition to evaluating the text-dependent and text-constrained variants of the Gaussian Mixt...
متن کاملPO1-3 Text-Independent Speaker Recognition for Ubiquitous Robot Companion
This paper describes text-independent speaker recognition system which is basic and essential for interaction between users and the robot in Ubiquitous Robot Companion environment. For comfortable interaction between users and the robot, we implement an online speaker enrollment module which trains a GMM for each speaker model and text-independent speaker recognition module which consists of te...
متن کاملDNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances
We investigate how to improve the performance of DNN ivector based speaker verification for short, text-constrained test utterances, e.g. connected digit strings. A text-constrained verification, due to its smaller, limited vocabulary, can deliver better performance than a text-independent one for a short utterance. We study the problem with “phonetically aware” Deep Neural Net (DNN) in its cap...
متن کاملText-constrained Speaker Recognition Using Hidden Markov Models
This paper presents a possible application of a text-dependent speaker recognition system within the unconstrained domain of telephone conversation speech, as contained in the Switchboard I corpus. The system utilizes word HMMs to generate likelihood scores for key words among the backchannel, filled pause, and discourse marker categories. Results on tests using a variant of the NIST 2001 exten...
متن کاملPhonetic, idiolectal and acoustic speaker recognition
This paper describes a text-independent speaker recognition system that achieves an equal error rate of less than 1% by combining phonetic, idiolect, and acoustic features. The phonetic system is a novel language-independent speakerrecognition system based on differences among speakers in dynamic realization of phonetic features (i.e., pronunciation), rather than spectral differences in voice q...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004